Multi-input Multi-output Beta Wavelet Network: Modeling of Acoustic Units for Speech Recognition
نویسندگان
چکیده
In this paper, we propose a novel architecture of wavelet network called Multi-input Multi-output Wavelet Network MIMOWN as a generalization of the old architecture of wavelet network. This newel prototype was applied to speech recognition application especially to model acoustic unit of speech. The originality of our work is the proposal of MIMOWN to model acoustic unit of speech. This approach was proposed to overcome limitation of old wavelet network model. The use of the multi-input multi-output architecture will allows training wavelet network on various examples of acoustic units. Keywords-wavelet network; multi-input multi-output wavelet network MIMOWN; speech recognition; modeling of acoustic units; wavelet network.
منابع مشابه
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
متن کاملData Augmentation Using Multi-Input Multi-Output Source Separation for Deep Neural Network Based Acoustic Modeling
We investigate the use of local Gaussian modeling (LGM) based source separation to improve speech recognition accuracy. Previous studies have shown that the LGM based source separation technique has been successfully applied to the runtime speech enhancement and the speech enhancement of training data for deep neural network (DNN) based acoustic modeling. In this paper, we propose a data augmen...
متن کاملAn Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...
متن کاملAllophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
متن کاملFeature mapping using far-field microphones for distant speech recognition
Acoustic modeling based on deep architectures has recently gained remarkable success, with substantial improvement of speech recognition accuracy in several automatic speech recognition (ASR) tasks. For distant speech recognition, the multi-channel deep neural network based approaches rely on the powerful modeling capability of deep neural network (DNN) to learn suitable representation of dista...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1211.2007 شماره
صفحات -
تاریخ انتشار 2012